Abstract. A new lower bound on the error probability of maximum likelihood decoding of
a binary code on a binary symmetric channel (BSC) was proved in Barg and McGregor (2004,
cs.IT/0407011). It was observed in that paper that this bound leads to a new region of code rates
in which the random coding exponent is asymptotically tight, giving a new region in which the
reliability of the BSC is known exactly. The present paper explains the relation of these results to the
union bound on the error probability.



1. Introduction 

This is a companion paper to [6]. Suppose that a code $C$ is used on a BSC($p$) and decoded
according to the maximum likelihood procedure. The error probability of decoding $P_e(C,p)$ can
be estimated from above using the distance distribution of $C$ together with the union bound. As
a general rule of thumb, this bound gives a good estimate of the error probability for low channel
noise and is loose for high noise. Quantifying this heuristic is a difficult problem related not just to
the distance distribution but also to structural properties of the code. Rigorous results are attainable
only in the asymptotic setting when the code length $n$ tends to infinity (therefore in effect we will
study families of codes rather than individual codes without always saying so). The inaccuracy of
the union bound is related to the fact that intersections of half-spaces related to codewords other
than the transmitted one are counted more than once. It turns out that under certain conditions
adding the measure of these intersections does not change the exponential asymptotics of the actual
value of the error probability. The first result of this type was obtained by Gallager lfT?ll who
proved that for the ensemble of random codes and for rate $R < R_{\text{crit}}$, where $R_{\text{crit}}$ is the so-called
critical rate of the channel (see below), the union bound gives the correct exponent of the average
error probability for this ensemble (this quantity is different from the error probability of a typical
random code, and both differ from the error probability of decoding for a typical linear code,
see O). The proof in Ifl^l is based on the fact that the error probability of decoding into a list of
size two decreases exponentially faster than the estimate of $P_e(C,p)$ given by the union bound. A
similar result can be proved for the ensemble of random linear codes using the ensemble-average
coset weight distribution.

Subsequent results of this type are substantially more involved. They are related to universal
bounds on the distance distribution of codes |[T6l [H and rely upon various methods of proving
lower bounds on $P_e$ given the distance distribution. One such method, due to [15], was used in
[16, 2] to prove new estimates of the reliability function of the BSC [16] and the power-constrained
AWGN channel [2]. Other known methods are due to [8, 9] and [10]. The main question addressed
by this analysis is the value of the code rate $R^*$ such that for rates $R < R^*$, the union bound can be
claimed to be exponentially tight.
The paper is organized as follows. In Sect. 2 we discuss the problem statement. Sect. 3 is devoted
to general lower estimates of the error probability $P_e(C,p)$ given the distance distribution of the
code $C$. In Sect. 4 we study the relation between the random coding exponent (the exponent of the
error probability for a typical linear code) and the union bounds on this probability. Our context is
that of geometry of decoding of random linear codes. We explain how different bounds on codes
are related to the union bound on the error probability. Then in Sect. 5 we put everything together
and show that a part of the random coding exponent just below the critical rate of the channel gives
the actual value of the channel reliability. Some concluding remarks are presented in the final
Section 6.



2. Statement of the problem 

We consider transmission with binary codes of length $n$ over a BSC with crossover probability
$p$. Let $X = \{0,1\}^n$ be the $n$-dimensional Hamming space. Let $C(n, M = 2^{Rn}) \subset X$ be a code of
rate $R$ and let $x_i \in C$ be the transmitted vector. Under this condition the probability that a vector
$y$ is received equals $P(y \mid x_i) = p^{|y+x_i|}(1-p)^{n-|y+x_i|}$, where $|\cdot|$ is the Hamming weight.

Let $D(x)$ be the decision region of max-likelihood decoding for a codevector $x$. Given that $x_i$ is
transmitted, the error probability of maximum likelihood decoding equals $P_e(x_i) = P_i(X \setminus D(x_i))$.
The (average) error probability of decoding for the code $C$ equals

    $P_e(C,p) = \frac{1}{M} \sum_{i=1}^{M} P_e(x_i).$

Computing this probability directly is prohibitively difficult in most nontrivial examples; therefore,
there has been much interest in bounding it from both sides. As in [6], we focus on lower bounds
on $P_e(C,p)$. For a given code sequence, define its error exponent as

    $E(p) = \lim_{n\to\infty} \frac{1}{n} \log \frac{1}{P_e(C,p)}.$

We will also apply the results of the paper to the largest attainable exponent of the error probability
of decoding defined as

    $E(R,p) = \limsup_{n\to\infty} \frac{1}{n} \log \max_{C \subset X,\ R(C)=R} \frac{1}{P_e(C,p)}.$

This quantity is also called the reliability function of the BSC. 

Let us fix an arbitrary ordering of the codewords. Define the local distance distribution of the
code $C$ with respect to the codeword $x_i$. This is a set of $n+1$ numbers $B_0^i, \dots, B_w^i, \dots, B_n^i$,
where $B_w^i$ is the number of neighbors of $x_i$ in the code at distance $w$. Below we will mostly
concentrate on lower bounds on the probability $P_e(x_i)$ given the local distance distribution. We
will consider codes of exponentially growing size for which the error probability $P_e(C,p)$ declines
exponentially fast. In this situation, given the average distance distribution of the code $C$, we can
isolate a subcode of the same exponential order in which the local distance distribution for every
codeword is asymptotically the same as the average distribution. Therefore, the bound on $P_e(x_i)$ can
be used to obtain a bound on $P_e(C,p)$ with the same exponent. This argument is presented in detail
in 11^121, so we will rely on it here without further discussion.



Notation. Let $C = \{x_1, \dots, x_M\}$ be a code. For a subset $Y \subset X$ let

    $P_i(Y) = \sum_{y \in Y} P(y \mid x_i).$

Let $\pi(w)$ be the error probability for two codewords at distance $w$, i.e., the probability of transmitting
$x_i$ and decoding $x_j$, where $d(x_i, x_j) = w$ and $d(\cdot,\cdot)$ denotes the Hamming distance. By the
union bound,

(1)    $P_e(x_i) \le \sum_{w=1}^{n} B_w^i\, \pi(w).$

Letting $\pi(\omega n) = 2^{nA(\omega)+o(n)}$, we have $A(\omega) = \omega \log 2\sqrt{p(1-p)}$. Then

(2)    $\frac{1}{n} \log \frac{1}{P_e(x_i)} \ge -A(\omega) - \beta(\omega),$

where $\beta(\omega) = \frac{1}{n} \log B_{\omega n}^i$.

By $h(\cdot)$ we denote the binary entropy function. We also use the divergence $D(x\|y) = -h(x) -
x \log y - (1-x)\log(1-y)$ (the logarithms are binary).
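The exponent $A(\omega)$ can be checked numerically against the exact pairwise error probability. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, and it assumes that ties between two codewords (possible only for even $w$) are broken at random, so that half of the tie probability counts as an error. The sum is accumulated in the log domain so that long codes do not underflow.

```python
import math

def A(omega: float, p: float) -> float:
    """Exponent A(omega) = omega * log2(2*sqrt(p(1-p))); it is negative for p < 1/2."""
    return omega * math.log2(2.0 * math.sqrt(p * (1.0 - p)))

def log2_pairwise_error(w: int, p: float) -> float:
    """log2 of pi(w) for two codewords at Hamming distance w on a BSC(p).

    An error occurs when more than w/2 of the w differing positions are flipped.
    Ties (even w only) are counted with weight 1/2, which is an assumed tie rule.
    """
    ln2 = math.log(2.0)
    terms = []
    for k in range((w + 1) // 2, w + 1):
        # log2 of C(w, k) p^k (1-p)^(w-k), via lgamma to avoid underflow
        t = (math.lgamma(w + 1) - math.lgamma(k + 1) - math.lgamma(w - k + 1)
             + k * math.log(p) + (w - k) * math.log(1.0 - p)) / ln2
        if 2 * k == w:
            t -= 1.0  # tie: only half of this probability mass is an error
        terms.append(t)
    m = max(terms)
    # log-sum-exp in base 2
    return m + math.log2(sum(2.0 ** (t - m) for t in terms))

if __name__ == "__main__":
    p = 0.1
    for w in (10, 100, 1000):
        # normalized exact exponent versus the asymptotic value A(1)
        print(w, log2_pairwise_error(w, p) / w, A(1.0, p))
```

The normalized value `log2_pairwise_error(w, p) / w` approaches `A(1.0, p)` as `w` grows; the gap is a polynomial factor of order $(\log w)/w$, which is invisible on the exponential scale used in (2).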

Bounds on codes. Define

    $\delta(R) = \limsup_{n\to\infty} \max_{C:\,|C|=2^{Rn}} \frac{d(C)}{n},$

where $d(C)$ denotes the minimum distance of the code $C$.
There exist code sequences (for instance, typical codes from the ensemble of random linear codes)
whose relative distance approaches the quantity $\delta_{GV}(R) = h^{-1}(1-R)$, which is called the
Gilbert-Varshamov (GV) distance. Thus,

    $\delta(R) \ge \delta_{GV}(R).$

On the other hand, by the Elias bound,

    $\delta(R) \le \delta_E(R) := 2\delta_{GV}(R)(1-\delta_{GV}(R)),$

where the quantity $\delta_E(R)$ is sometimes called the Elias distance. A better upper estimate of $\delta(R)$
is provided by the JPL bound [17]:

    $\delta(R) \le \bar\delta(R) := \min_{0 \le \alpha \le 1/2} G(\alpha,\tau),$

where $G(\alpha,\tau) = 2\,\frac{\alpha(1-\alpha)-\tau(1-\tau)}{1+2\sqrt{\tau(1-\tau)}}$ and where $\tau$ satisfies $h(\tau) = h(\alpha) - 1 + R$.
For $0 \le R \le 0.305$
this bound takes a simpler form: $\bar\delta = \varphi(h^{-1}(R))$, where $\varphi(x) = \frac{1}{2} - \sqrt{x(1-x)}$. Denote by $R(\delta)$
the inverse function of $\bar\delta(R)$, which is well defined because $\bar\delta$ is a monotone decreasing function of
$R$.
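The three distance functions of this subsection are easy to tabulate. The sketch below is illustrative only: the function names are hypothetical, the inverse entropy is computed by plain bisection, and the JPL bound is evaluated in its simpler form $\varphi(h^{-1}(R))$ with $\varphi(x) = 1/2 - \sqrt{x(1-x)}$, which the text states for rates up to 0.305.

```python
import math

def h(x: float) -> float:
    """Binary entropy function (base-2 logarithms)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def h_inv(y: float) -> float:
    """Inverse of h on [0, 1/2], by bisection (h is increasing on this interval)."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if h(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_gv(R: float) -> float:
    """Gilbert-Varshamov distance h^{-1}(1 - R)."""
    return h_inv(1.0 - R)

def delta_elias(R: float) -> float:
    """Elias distance 2 * d * (1 - d) with d the GV distance."""
    d = delta_gv(R)
    return 2.0 * d * (1.0 - d)

def phi(x: float) -> float:
    """phi(x) = 1/2 - sqrt(x(1-x))."""
    return 0.5 - math.sqrt(x * (1.0 - x))

def delta_jpl_low_rate(R: float) -> float:
    """Simpler form of the JPL bound; only meant for 0 <= R <= 0.305."""
    return phi(h_inv(R))

if __name__ == "__main__":
    for R in (0.05, 0.10, 0.20, 0.30):
        print(R, delta_gv(R), delta_jpl_low_rate(R), delta_elias(R))
```

For each rate in the table the printed values are ordered as GV distance, JPL bound, Elias distance, which is the ordering the later argument relies on.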



3. Lower bounds on $P_e(C,p)$

In this section we review the known lower estimates of the probability $P_e(x_i)$ given the local
distance distribution of the code. Let $C(i) = \{x \in C : d(x, x_i) = w\}$ for some fixed value of $w$.
Given two different vectors $x_i, x_j \in C$, let

Xij C Xij := {y e X : d{xj,y) < d{xi,y)} 

be an arbitrary subset. 



3.1. Kounias' bound ESI. This (obvious) bound states that 

Xj£C{i) XkGC{i)\{xj} 

k<j 

In principle, here and hereafter $C(i)$ can be an arbitrary subcode of $C$ that does not contain $x_i$.

3.2. Burnashev's method IHIIIIEI. This method was originally suggested for the AWGN channel
and was adapted to the BSC in 0. The error probability of decoding is estimated by carefully
taking account of the probability of the subsets $X_{ij} \cap X_{ik}$, $k \ne j$, for $x_j, x_k \in C(i)$ and for some
suitable definition of the subsets $X_{ij}$. Let $x_j, x_k \in C(i)$, $d(x_i, x_j) = d(x_i, x_k) = \omega n$, $d(x_j, x_k) =
\lambda n$. Let

(3)    $X_{ij} = \{y \in X : d(x_i, y) = d(x_j, y) = \frac{\omega n}{2} + pn(1-\omega)\}.$
Denote by $B(\omega,\lambda)$ the negative exponent of the probability $P_i(X_{ik} \mid X_{ij})$,

(4) B{uj,X) = -uj-{l-uj)h{p) + 

^ ( Aftf^) + (. - A/2)/, (^) + (1 - . - XI2)h (f " ^^-J' 

77G[^,mm(|,p(l-c^))] V \2uJ - XJ \l-UJ-X/2 

The main result of [6] is given by



Theorem 1. [6] Let $(C_i)_{i \ge 1}$ be a sequence of codes with rate $R$, relative distance $\delta$ and distance
distribution satisfying $B_{\omega n} \ge 2^{n(\beta(\omega)-o(1))}$, where $\beta(\omega) > 0$ for all $\delta \le \omega \le 1$. The error probability
of max-likelihood decoding of these codes satisfies $P_e(C,p) \ge 2^{-En+o(n)}$, where

(5)    $E = \min_{\delta \le \omega \le 1}\ \max_{0 \le \lambda \le \omega}\ \max\{-\beta(\omega) - A(\omega),\ B(\omega,\lambda) - A(\lambda)\}.$

As it turns out, for sufficiently low code rates $R$, the first term under the maximum in (5) dominates
the estimate. This shows that for code rates $R < R^*$ the union bound is exponentially tight,
where $R^*$ is some value of the rate that depends on the distance distribution of the code and on the
noise level in the channel. We will study the values of $R^*$ in Sect. 5 for the problem of bounding
the channel reliability function.

3.3. The method of Cohen and Merhav: de Caen's inequality and its generalizations. D. de
Caen [11] suggested a new lower bound on the probability of a finite union of events. While
an elementary result (essentially, Cauchy-Schwarz), this bound is sometimes the best among the
inequalities of this type. De Caen's inequality was used to compute lower bounds on the error
probability via the distance distribution in lfTFlfT?ll . Cohen and Merhav [10] generalized de Caen's
inequality by introducing a weighting function that depends on the weight of the error vector and
derived a lower bound on $P_e(C,p)$ by optimizing on this function. Their result can be stated as
follows.
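Theorem 2. [10] Let $x_j, x_k \in C(i)$ be arbitrary vectors, $j \ne k$. Then

(6)    $P_e(x_i) \ge \dfrac{B_w^i \left[\sum_{y \in X_{ij}} P(y \mid x_i)\,\eta(|y|)\right]^2}{\sum_{y \in X_{ij}} P(y \mid x_i)\,\eta^2(|y|) + (B_w^i - 1) \sum_{y \in X_{ij} \cap X_{ik}} P(y \mid x_i)\,\eta^2(|y|)},$

where $\eta(\cdot)$ is an arbitrary weight function.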



Taking $C(i)$ to be the set of neighbors of $x_i$ at the minimum distance $d$, paper [10] obtains a
bound on $P_e(x_i)$ formed of two pieces. Similarly to Theorem 1, Theorem 2 implies that for low
rates the exponent of $P_e(x_i)$ asymptotically coincides with the exponent of the union bound. The
condition on the code rate for the union bound on $P_e(x_i)$ to be (exponentially) tight, proved in [10,
Prop. 5.3], can be written as follows:

(7)    $B_w^i\, P_i(X_{ij} \cap X_{ik}) \lesssim P_i(X_{ij}),$

where $x_j, x_k \in C(i)$ are arbitrary (different) codewords and $\lesssim$ refers to an inequality for the
exponents.¹

4. Decoding geometry of random linear codes and the union bound 

4.1. Decoding of random linear codes. Consider the ensemble of linear codes defined by
$(n-k) \times n$ parity-check matrices with independent random components chosen with equal probability
from $\{0,1\}$. Let $R = k/n$. The ensemble-average weight distribution has the form $B_{\omega n} =
2^{n(R-1+h(\omega))}$, $\omega = 1/n, \dots, (n-1)/n, 1$. The minimum relative distance $\delta$ of a typical code
from the ensemble approaches the Gilbert-Varshamov bound $\delta_{GV}(R) = h^{-1}(1-R)$. Computing
the error probability $P_e(C)$ for such a code, we obtain a lower bound on the BSC reliability of
the form $E(R,p) \ge E_0(R,p)$, where $E_0(R,p)$ is the "random coding exponent,"

(8)    $E_0(R,p) = \begin{cases}
           -\delta_{GV}(R) \log 2\sqrt{p(1-p)}, & 0 \le R \le R_x, \quad \text{(a)} \\
           D(\rho_0\|p) + R_{\text{crit}} - R, & R_x \le R \le R_{\text{crit}}, \quad \text{(b)} \\
           D(\delta_{GV}(R)\|p), & R_{\text{crit}} \le R \le 1 - h(p), \quad \text{(c)}
       \end{cases}$

where

(9)    $R_x = 1 - h(\omega_0),$

(10)   $R_{\text{crit}} = 1 - h(\rho_0),$

(11)   $\rho_0 = \frac{\sqrt{p}}{\sqrt{p} + \sqrt{1-p}}, \qquad \omega_0 := 2\rho_0(1-\rho_0).$

This is a classical result of coding theory due to P. Elias and R. Gallager. Concise, self-contained
proofs that are suitable for our context appear in [5, 4].
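The piecewise form of the random coding exponent can be evaluated directly. The sketch below is an illustration rather than the paper's own computation: the function names are hypothetical, and it assumes the standard expressions $\rho_0 = \sqrt{p}/(\sqrt{p}+\sqrt{1-p})$, $\omega_0 = 2\rho_0(1-\rho_0)$, $R_x = 1 - h(\omega_0)$, $R_{\text{crit}} = 1 - h(\rho_0)$ and the usual binary divergence $D(x\|y) = x\log(x/y) + (1-x)\log((1-x)/(1-y))$.

```python
import math

def h(x: float) -> float:
    """Binary entropy function (base-2 logarithms)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def h_inv(y: float) -> float:
    """Inverse of h on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if h(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def divergence(x: float, y: float) -> float:
    """Binary divergence D(x||y) for 0 < x, y < 1."""
    return x * math.log2(x / y) + (1.0 - x) * math.log2((1.0 - x) / (1.0 - y))

def thresholds(p: float):
    """Return (rho_0, omega_0, R_x, R_crit) for a BSC(p) under the assumed formulas."""
    rho0 = math.sqrt(p) / (math.sqrt(p) + math.sqrt(1.0 - p))
    omega0 = 2.0 * rho0 * (1.0 - rho0)
    return rho0, omega0, 1.0 - h(omega0), 1.0 - h(rho0)

def random_coding_exponent(R: float, p: float) -> float:
    """Piecewise exponent E_0(R, p): cases (a), (b), (c), and 0 above capacity."""
    rho0, _, r_x, r_crit = thresholds(p)
    if R <= r_x:                       # (a) low rates
        return -h_inv(1.0 - R) * math.log2(2.0 * math.sqrt(p * (1.0 - p)))
    if R <= r_crit:                    # (b) straight-line part
        return divergence(rho0, p) + r_crit - R
    if R <= 1.0 - h(p):                # (c) sphere-packing part
        return divergence(h_inv(1.0 - R), p)
    return 0.0

if __name__ == "__main__":
    p = 0.08
    print(thresholds(p))
    for R in (0.0, 0.05, 0.1, 0.2, 0.3, 0.5):
        print(R, random_coding_exponent(R, p))
```

A useful sanity check on the assumed formulas is that the three pieces join continuously at $R_x$ and $R_{\text{crit}}$ and that the exponent vanishes at capacity $1 - h(p)$.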

A part of this result that is used below is related to the typical weight $\omega_{\text{typ}} n$ of the incorrectly
decoded codeword in the case of decoding error.² For the cases (a)-(c) of (8) the values of $\omega_{\text{typ}}$ are
as follows Q: 

(a) $\omega_{\text{typ}} = \delta_{GV}(R)$

(b) $\omega_{\text{typ}} = \omega_0$

(c) $\omega_{\text{typ}} = \delta_E(R)$.

In Fig. 1 the bound $E_0(R,p)$ is shown together with the values $\omega_{\text{typ}}$ as a function of the code rate
$R$. As $R$ varies between $R_x$ and $R_{\text{crit}}$, the value of $\omega_{\text{typ}} = \omega_0$ changes its location with respect to
the minimum distance of the code, moving from $\delta_{GV}(R)$ to $\delta_E(R)$. We note that $\omega_{\text{typ}} < \delta_E(R)$ as
long as $R < R_{\text{crit}}$.

Figure 1. The typical weight of incorrect codewords and the random coding exponent
for a BSC with $p = 0.08$.

¹Note that (7) relies on Xij instead of Xij. The reason for this is explained in the end of Sect. 5 below.

²The expression for $P_e(C)$ is a finite sum of binomial-type probabilities. Asymptotically for large $n$ it is dominated
by weights of incorrectly decoded codewords in a small segment around some value, which is called a typical weight
of incorrect codewords.

4.2. Weight distributions and the union bound on $P_e(x_i)$. It is conjectured that $E_0(R,p)$ gives
an exact value of $E(R,p)$ for all $R \in [0, 1-h(p)]$. In an attempt to prove this, various upper
bounds on $E(R,p)$ were established. The tightest known upper bounds are proved by showing
that an appropriate version of the union bound in effect is tight (entails no loss of accuracy of the
estimate for large $n$).

The weight profile (the exponent of the weight distribution) of a typical random linear code of
rate $R$ has the form $R - 1 + h(\omega)$, $\omega \ge \delta_{GV}(R)$. As explained above, only the weights in the region
$\delta_{GV}(R) \le \omega \le \delta_E(R)$ are relevant for the random coding exponent. Let us assume for a moment
that

(A) for any code $C$, a given codeword $x_i$ has at least $2^{n(R-1+h(\omega))}$ codeword neighbors at relative
distance $\omega = g(R)$, where $g$ is some monotone decreasing function;

(B) the union bound gives a tight value of the error exponent in the estimates (5) and/or (6) for
some region of low rates, to be specified later.

By (B), we can write an asymptotic estimate of $P_e(x_i)$ using (1) in the reverse direction. Substituting
the distance distribution from (A) we would be able to state an upper bound on $E(R,p)$ of the form

(12)    $E(R,p) \le -(R - 1 + h(g(R))) - A(g(R)).$
Figure 2. Bounds on the error exponent for the BSC with $p = 0.08$. In the interval
$R_1 \le R \le R_{\text{crit}}$ the random coding bound $E_0(R,p)$ is tight. A discrepancy between
upper and lower bounds on $E(R,p)$ remains for rates in the interval $0 < R < R_1$.

For instance, if (A) were true for $\omega = \delta_{GV}(R)$ then we would obtain (8a) as an upper bound on
$E(R,p)$ (this is a very strong assumption because it implies that the GV bound is tight). In this
case $g(R) = \delta_{GV}(R)$.

We will assume that $g(x)$ is such that the function $-(R - 1 + h(g(R))) - A(g(R))$ is $\cup$-convex
(this will be the case in all our examples).

Two important remarks should be made with respect to this argument and Fig. 2. We formulate
the first one as

Lemma 3. The function on the right-hand side of (12) is tangent to the straight line $D(\rho_0\|p) +
R_{\text{crit}} - R$ at the point $R_1 = g^{-1}(\omega_{\text{typ}})$.
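The tangency claimed in the lemma can be checked numerically for a concrete choice of $g$. The sketch below is only an illustration under stated assumptions: it takes $g(R) = \varphi(h^{-1}(R))$ with $\varphi(x) = 1/2 - \sqrt{x(1-x)}$ (the simpler form of the JPL bound), uses $\omega_{\text{typ}} = \omega_0 = 2\rho_0(1-\rho_0)$ with $\rho_0 = \sqrt{p}/(\sqrt{p}+\sqrt{1-p})$, and approximates the slope by a central difference. All function names are hypothetical.

```python
import math

def h(x):
    """Binary entropy function (base-2 logarithms)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def h_inv(y):
    """Inverse of h on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if h(mid) < y else (lo, mid)
    return 0.5 * (lo + hi)

def phi(x):
    return 0.5 - math.sqrt(x * (1.0 - x))

def g(R):
    """Assumed distance function: simpler form of the JPL bound."""
    return phi(h_inv(R))

def rhs_union(R, p):
    """-(R - 1 + h(g(R))) - A(g(R)) with A(w) = w * log2(2 sqrt(p(1-p)))."""
    w = g(R)
    return -(R - 1.0 + h(w)) - w * math.log2(2.0 * math.sqrt(p * (1.0 - p)))

def straight_line(R, p):
    """D(rho_0 || p) + R_crit - R."""
    r0 = math.sqrt(p) / (math.sqrt(p) + math.sqrt(1.0 - p))
    d = r0 * math.log2(r0 / p) + (1.0 - r0) * math.log2((1.0 - r0) / (1.0 - p))
    return d + (1.0 - h(r0)) - R

def tangency_point(p):
    """R_1 with g(R_1) = omega_0; uses the fact that phi is its own inverse."""
    r0 = math.sqrt(p) / (math.sqrt(p) + math.sqrt(1.0 - p))
    return h(phi(2.0 * r0 * (1.0 - r0)))

if __name__ == "__main__":
    p, step = 0.08, 1e-5
    r1 = tangency_point(p)
    slope = (rhs_union(r1 + step, p) - rhs_union(r1 - step, p)) / (2.0 * step)
    print("R_1 =", r1)
    print("gap at R_1 =", rhs_union(r1, p) - straight_line(r1, p))
    print("slope at R_1 =", slope)  # the straight line has slope -1
```

The reason the check succeeds for any differentiable $g$ is that the derivative of $-h(\omega) - A(\omega)$ in $\omega$ vanishes exactly at $\omega_0$, so the curve touches the line of slope $-1$ at $R_1$ and stays above it elsewhere.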

Thus if $g^{-1}(\omega_{\text{typ}}) < R_{\text{crit}}$, the random coding bound $E_0(R,p)$ of (8) gives an exact answer for
the channel reliability $E(R,p)$ at the point $R = R_1$. Furthermore, together with the straight-line
principle of [19] this implies that $E(R,p) = E_0(R,p)$ for all rates $R_1 \le R \le R_{\text{crit}}$. A result of this
type will be proved in the next section.

Secondly, if $\omega_{\text{typ}} = \delta_E(R)$ then it turns out that almost every error vector from the sphere of
typical errors leads to a decoding error (see e.g., [4]). Therefore, for $R > R_{\text{crit}}$ instead of (12) we
compute a "union bound" of a different type, namely, the probability of an error vector of weight
$\delta_{GV}(R)$ occurring in the channel. This argument is not related to the above assumptions and gives
(8c) as an unconditional upper bound on $E(R,p)$ (the sphere-packing bound).



5. Reliability function of the BSC



In this section we study an application of the above ideas to bounds on the function $E(R,p)$.
Recently linear programming was used to derive bounds on the distance distribution of codes
ifT^ni. In particular, paper [16] proves the following lower bound on the distance distribution of
an arbitrary code family of rate $R$.



Theorem 4. [16] For any family of codes of sufficiently large length and rate $R$ and any $\alpha \in
[0, 1/2]$ there exists a value $\omega$, $0 \le \omega \le G(\alpha,\tau)$, such that $n^{-1} \log B_{\omega n} \ge \mu(R,\alpha,\omega) - o(1)$, where

    $\mu(R,\alpha,\omega) = R - 1 + h(\tau) + 2h(\alpha) - 2q(\alpha,\tau,\omega/2) - \omega - (1-\omega)\, h\!\left(\frac{\alpha - \omega/2}{1-\omega}\right),$

$\tau = h^{-1}(h(\alpha) - 1 + R)$, and where

    $q(\alpha,\tau,u) = h(\tau) + \int_0^u dy\, \log \frac{P + \sqrt{P^2 - 4Qy^2}}{2Q},$

where $P = \alpha(1-\alpha) - \tau(1-\tau) - y(1-2y)$, $Q = (\alpha-y)(1-\alpha-y)$, is the exponent of the Hahn
polynomial H^^iujn).

This theorem was used in [16] to tighten the upper bound for $E(R,p)$ for low rates, giving
implicitly a condition for the union bound to be tight for low rates. Using this result together with
Theorem 1 we observe that there exists a value of the rate $R = R_0$, a function of $p$, such that for
$0 \le R \le R_0$, the first term under the maximum in (5) is greater than the second one. The following
statement was proved in [6].

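Theorem 5. Let $R(2\rho_0(1-\rho_0)) \le R_0$, where $\rho_0$ is defined in (11). Then

(13)    $E(R,p) \le -A(\bar\delta) - R + 1 - h(\bar\delta), \qquad 0 \le R \le R_0,$

(14)    $E(R,p) \le \max_{0 \le \lambda \le \bar\delta}\ \max_{\lambda \le \omega \le \bar\delta} B(\omega,\lambda) - A(\lambda), \qquad R_0 \le R.$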

Explicit optimization in (14) is difficult because of the cubic condition on the optimal value of
the parameter $\eta$ in (4) and for other similar reasons; however, the bound can be computed for a
given $p$. Observe that by (13), for $R < R^*$ the BSC reliability $E(R,p)$ is estimated from above by
the exponent of the union bound. By Lemma 3, the bound (13) is tangent to the straight-line
part of $E_0(R,p)$.

It is clear that $R_1 < R_{\text{crit}}$ simply because $\bar\delta(R) < \delta_E(R)$, i.e., the JPL function is less than
the Elias distance. Observe that for $p > 0.04$, the value $R_1 < 0.287$ (and for $p > 0.05$ even
$R_{\text{crit}} < 0.305$). For rates in this region we have $\bar\delta = \varphi(h^{-1}(R))$, and then the point of tangency is
given by $R_1 = h(\varphi(\omega_{\text{typ}}))$ (since $\varphi = \varphi^{-1}$).
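The numerical thresholds quoted above can be reproduced in a few lines. This is a hedged sketch with hypothetical function names; it assumes $\omega_{\text{typ}} = \omega_0 = 2\rho_0(1-\rho_0)$ with $\rho_0 = \sqrt{p}/(\sqrt{p}+\sqrt{1-p})$, the point of tangency $R_1 = h(\varphi(\omega_0))$ with $\varphi(x) = 1/2 - \sqrt{x(1-x)}$, and $R_{\text{crit}} = 1 - h(\rho_0)$.

```python
import math

def h(x: float) -> float:
    """Binary entropy function (base-2 logarithms)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def rho0(p: float) -> float:
    return math.sqrt(p) / (math.sqrt(p) + math.sqrt(1.0 - p))

def r_crit(p: float) -> float:
    """Critical rate 1 - h(rho_0)."""
    return 1.0 - h(rho0(p))

def r_1(p: float) -> float:
    """Point of tangency h(phi(omega_0)); meaningful where the simpler JPL form applies."""
    r = rho0(p)
    omega0 = 2.0 * r * (1.0 - r)
    return h(0.5 - math.sqrt(omega0 * (1.0 - omega0)))

if __name__ == "__main__":
    print(" p      R_1      R_crit")
    for p in (0.04, 0.05, 0.08, 0.10, 0.20):
        print(f"{p:.2f}  {r_1(p):.4f}  {r_crit(p):.4f}")
```

Both quantities decrease as $p$ grows, so the values at $p = 0.04$ and $p = 0.05$ are the relevant upper limits for the ranges stated in the text.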

Now to ensure that $E(R_1,p) = E_0(R_1,p)$ it remains to show that the union bound exponent can
still be claimed an upper bound on $E(R,p)$ for $R = R_1$, or that $R_1 < R^*$. This can be verified by
computing the bounds (13)-(14) and the value of $R^*$. The computation leads to the following result
(see also Fig. 2).

Theorem 6. Let $p$, $0.046 < p < 1/2$, be the channel transition probability. Then the channel
reliability $E(R,p)$ equals the random coding exponent $E_0(R,p)$ for $R_1 \le R \le R_{\text{crit}}$.

Previously the bound $E_0(R,p)$ was known to be tight only for the rates $R \in [R_{\text{crit}}, 1-h(p)]$.

Given the rate $R$ and the distance distribution of the code, the value of $R^*$ is determined uniquely.
Based on the computational evidence, the union bound can be claimed exponentially tight (under
the approach of this section) if the code rate satisfies (7). Observe that Theorems 1 1151 lead to
the same result because of our particular choice of the subsets Xij. Another possibility is to take
Xij = $\{y \in X : d(x_j,y) \le d(x_i,y)\}$, in which case these theorems would give a weaker result
than lITDIl (this is the essence of the discussion in [10, p. 301]). The region X^j in Theorem 2 is also
suboptimal, but the correction term $\eta(\cdot)$ performs a transformation to the optimal region X^j.



6. Concluding remarks, conjectures 



The method of this paper and [6] still stops short of proving that $E_0(R,p)$ is tight for all rates
$R_x \le R \le R_{\text{crit}}$. The crucial elements of the argument made above are (a) the fact that the
JPL bound $\bar\delta(R)$ is better than the Elias bound and (b) the straight-line principle of [19]. Further
progress can be related either to an improvement of bounds on codes, which at present looks very
difficult, or to new ideas for extending a known bound on $E(R,p)$ for low rates.

We remark that arguments and results similar to those obtained here for the BSC can also be
obtained for a power-constrained AWGN channel. They are briefly discussed in [6]. The geometric
picture that describes the relation of the random coding bound and the union bounds in this case is
qualitatively the same as that of Sections HI 13

If the GV bound is tight, then so is the bound $E_0(R,p)$ on the channel reliability. The converse
claim, i.e., the implications of the (putative) tightness of $E_0(R,p)$ for bounds on codes, is not so
obvious. To be more precise, the following question seems open.

Open problem 1. Assuming that the bound (8b) gives an exact value of $E(R,p)$ for all $R$ in the
interval $(R_x, R_{\text{crit}})$, is it possible (with the current knowledge) that there exists a sequence of codes
whose minimum distance asymptotically exceeds the GV distance?

This is certainly not true for code sequences in which the number of codewords of minimum weight
grows subexponentially in $n$; however, there exist codes with exponentially many minimum-weight
vectors [3]. A weight distribution that might support a positive answer to the above open problem
is of the form

    $B_{\omega n} = 0, \qquad 0 < \omega < \delta,$
    $B_{\omega n} \ge 2^{n\alpha(\omega)}, \qquad \delta \le \omega,$

where $\delta > \delta_{GV}$ and $\alpha(\omega) > R - 1 + h(\omega)$. Note that the weight distribution of the code family
whose existence is proved in [3] is not of this form and its distance is less than $\delta_{GV}$. If the answer
to this problem is positive, this should not be very difficult.

Given an upper bound on $E(R,p)$ for some rate $R_0$, the straight-line bound of [19] gives a
method of obtaining upper bounds on $E(R,p)$ for rates $R > R_0$.

Open problem 2. Given an upper bound on $E(R,p)$ for some rate $R = R_0$, find a way of obtaining
upper bounds for $R < R_0$.

This problem presently seems difficult. 

So far the results for the reliability of the BSC and general discrete memoryless channels
(DMCs) have been similar. However, apart from straightforward generalizations, it is not clear
how to extend the result of this paper to DMCs. Therefore, let us formulate

Open problem 3. Prove that the random coding bound on the reliability function of a DMC is tight
for rates immediately below $R_{\text{crit}}$.

Given the similarity of results for a particular distance distribution of Sect. 5 obtained by the methods
of [illi ilOl and f^'^, another open question that arises is whether the lower bounds of [9] and
ifTm are generally related. If this is indeed the case, then the approach of [10] would give a more
direct alternative to the successive refinement of the estimate of $P_e(x_i)$ performed in 0. This
would also have consequences in the more general context of hypothesis testing.